DETAILED ACTION
Claim Rejections - 35 USC § 103 

1. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

2. Claims 1, 3, 5-8, and 11 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Junqua et al. (US 6415257) in view of Allinger (DE 19747745). 

3. Regarding claims 1 and 7, Junqua et al. disclose a dialog system and method
comprising processing units for

automatic speech recognition (speech recognizer 12 in figure 1),
natural language understanding (elements 24 and 30 in figure 1),
generating visual system outputs (col. 10, line 65 to col. 11, line 3 and/or
element 36 of figure 1), and
deriving user models from determined details about a style of speech of user
inputs (the process of figure 7, training the new user using speech characteristics/style
of the new user).

Junqua et al. fail to specifically disclose adaptation of system outputs in
dependence on the derived user models, wherein the system outputs are adapted to the
style of the speech of the user inputs including at least two of a colloquial language,
standard language, dialect. However, Allinger teaches adaptation of system outputs in
dependence on the derived user models, wherein the system outputs are adapted to the
style of the speech of the user inputs including at least two of a colloquial language,
standard language, dialect (page 6, lines 15-16 shows speech recognition capability,
and page 7, lines 1-32, outputs are adapted in content based on the user's input; the
output language is in standard language and colloquial language, the latter being defined
by dictionary.com as "involving or using conversation").

Because Junqua et al. and Allinger are analogous art from the same field of
endeavor, namely speech recognition, it would have been obvious to one of ordinary
skill in the art at the time of invention to modify Junqua et al. by incorporating the
teaching of Allinger in order to provide shorter responses to a more experienced user.
This would reduce the user's frustration and hence improve the effectiveness of the
system.

4. Regarding claim 8, Junqua et al. disclose a process for television-user dialog, 
comprising the steps of: 

receiving user speech input (element 10 in figure 1); 

processing the speech input using automatic speech recognition and natural
language understanding (elements 12 and 24 in figure 1); and

defining at least one system output based on the speech input and a user model
derived from details of the user style of speech inputs (col. 2, line 54 to col. 3, line 67,
speech/speaker adaptation; and output response to the user if the input speech
command is recognized and the correct response is found).

Junqua et al. fail to specifically disclose defining at least one system output 
based on the speech input and a user model derived from details of a style of the 
speech input, wherein the at least one system output in content is based on the style of 
the speech input including at least two of a colloquial language, standard language, 
dialect. However, Allinger teaches defining at least one system output based on the
speech input and a user model derived from details of a style of the speech input,
wherein the at least one system output in content is based on the style of the speech
input including at least two of a colloquial language, standard language, dialect (page 6,
lines 15-16 shows speech recognition capability, and page 7, lines 1-32, outputs are
adapted in content based on the user's input; the output language is in standard
language and colloquial language, the latter being defined by dictionary.com as
"involving or using conversation").

Because Junqua et al. and Allinger are analogous art from the same field of
endeavor, namely speech recognition, it would have been obvious to one of ordinary
skill in the art at the time of invention to modify Junqua et al. by incorporating the
teaching of Allinger in order to provide shorter responses to a more experienced user.
This would reduce the user's frustration and hence improve the effectiveness of the
system.

5. Regarding claim 3, Junqua et al. further disclose a dialog system characterized in
that the user models contain estimates for the reliability of recognition results derived
from user inputs (col. 7, lines 1-32, the score associated with each candidate represents
the reliability of each recognized candidate).

6. Regarding claim 5, Junqua et al. further disclose a dialog system characterized in
that fixed models of user stereotypes are used for forming the user models (col. 8,
lines 8-26, a speaker adaptation process).

7. Regarding claim 6, Junqua et al. further disclose a dialog system characterized in
that user models are used which are continuously updated based on inputs of the
respective user (col. 3, lines 1-27, the system includes a usage log recording the user's
everyday use of the system).

8. Regarding claim 11, Junqua et al. further disclose the process of Claim 8,
wherein the step of defining comprises the step of: defining at least one system output
based on the speech input and a user model which includes a familiarity level, wherein
the system output is based on the familiarity level (col. 3, lines 1-25, familiarity level is
determined by how often and/or how long the user has used the system and that is
specified in the usage log).
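
For illustration only, and not as a description of any cited reference's actual implementation, a familiarity level derived from a usage log and used to shorten system output might be sketched as follows. The function names, the 30-day window, the 10-session threshold, and the prompt wording are all hypothetical assumptions.

```python
from datetime import datetime, timedelta


def familiarity_level(session_starts, now, window_days=30, expert_sessions=10):
    """Classify a user from a usage log of session start times.

    Assumption: "how often" is approximated by the number of sessions
    in a recent window; the window and threshold are illustrative only.
    """
    window = timedelta(days=window_days)
    recent = [t for t in session_starts if timedelta(0) <= now - t <= window]
    return "experienced" if len(recent) >= expert_sessions else "novice"


def prompt_for(level):
    """Return a shorter prompt for an experienced user (hypothetical wording)."""
    if level == "experienced":
        return "Channel?"
    return "Please say the name of the channel you would like to watch."


now = datetime(2024, 1, 31)
frequent_log = [now - timedelta(days=i) for i in range(12)]
print(prompt_for(familiarity_level(frequent_log, now)))
```

The sketch keeps the familiarity estimate separate from the output wording, so the threshold can change without touching the prompts.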

9. Claims 2, 4, and 10 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Junqua et al. (US 6415257) in view of Allinger (DE 19747745), as applied to claims
1 and 8, and further in view of Larsen (IEEE Publication).

10. Regarding claim 2, Junqua et al. further disclose a dialog system characterized in 
that in addition to the input modality to use user inputs by means of speech, at least a 
further input modality is provided (col. 3, lines 35-44). Junqua et al. do not disclose a
dialog system characterized in that the user models contain details about the respective 
use of the various input modalities by the user. 

However, Larsen teaches a bi-modal application used in a dialog system, where 
a DTMF input mode is used if repeated recognition errors occur in the speech 
recognition mode (referring to APPLICATION SECTION on pages 66-67). The 
advantage of using the teaching of Larsen in Junqua et al. is to enable the system to 
take appropriate actions to process the input signal to achieve high accuracy. 
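
For illustration only, the mode switch attributed to Larsen above (falling back to DTMF input after repeated speech recognition errors) might be sketched as follows. The class name, the threshold of two consecutive errors, and the choice to stay in DTMF once switched are hypothetical assumptions, not details taken from the reference.

```python
from dataclasses import dataclass

MAX_SPEECH_ERRORS = 2  # hypothetical threshold; the cited passage gives no number


@dataclass
class ModalitySelector:
    """Tracks consecutive speech recognition failures for one dialog."""

    consecutive_errors: int = 0
    mode: str = "speech"

    def record_result(self, recognized: bool) -> str:
        """Update the error count and return the input mode to use next."""
        if self.mode == "dtmf":
            return self.mode  # assumption: once switched, stay in DTMF
        if recognized:
            self.consecutive_errors = 0
        else:
            self.consecutive_errors += 1
            if self.consecutive_errors >= MAX_SPEECH_ERRORS:
                self.mode = "dtmf"
        return self.mode


selector = ModalitySelector()
for outcome in (False, True, False, False):
    current_mode = selector.record_result(outcome)
print(current_mode)
```

A successful recognition resets the counter, so only repeated errors in a row trigger the switch.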

Because Junqua et al. and Larsen are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at
the time of invention to modify Junqua et al. by incorporating the teaching of Larsen in
order to enable the system to take appropriate actions to process the input signal to
achieve high accuracy.

The modified Junqua et al. still fail to disclose a dialog system characterized in
that the user models contain details about the respective use of the various input
modalities by the user. However, one of ordinary skill in the art at the time of
invention would have readily recognized that the DTMF and speech input modes
taught by Larsen are different and are represented by two distinct signals. It would
therefore have been obvious for the system to distinguish and process these two
signals differently in order to enhance the system's efficiency and reliability.

11. Regarding claim 4, Junqua et al. do not disclose a dialog system characterized in
that in dependence on the estimates, system responses are generated which prompt 
the respective user to use such input modalities for which high estimate values were 
determined and/or which prevent the respective user from using input modalities for 
which low reliability values were determined. 

However, Larsen teaches a dialog system characterized in that in dependence 
on the estimates, system responses are generated which prompt the respective user to 
use such input modalities for which high estimate values were determined and/or which 
prevent the respective user from using input modalities for which low reliability values 
were determined (referring to APPLICATION SECTION on pages 66-67). The 
advantage of using the teaching of Larsen in the modified Junqua et al. is to allow the 
system to switch to a different input mode to achieve high recognition accuracy. 

Because the modified Junqua et al. and Larsen are analogous art from the same
field of endeavor, it would have been obvious to one of ordinary skill in
the art at the time of invention to further modify Junqua et al. by incorporating the
teaching of Larsen in order to allow the system to switch to a different input mode to
achieve high recognition accuracy.

12. Regarding claim 10, Junqua et al. further teach the process of Claim 8, wherein
the step of defining comprises the step of: defining at least one system output based on
the speech input and a user model, wherein the system output is based on the likely
input modality (col. 3, lines 1-67). Junqua et al. fail to specifically disclose a user model,
which includes a likely input modality for a current prompt. However, Larsen teaches a
user model, which includes a likely input modality for a current prompt (referring to
APPLICATION SECTION on pages 66-67).

Because Junqua et al. and Larsen are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at
the time of invention to modify Junqua et al. by incorporating the teaching of Larsen in
order to enable the system to take appropriate actions to process the input signal to
achieve high accuracy.

13. Claim 12 is rejected under 35 U.S.C. 103(a) as being unpatentable over Junqua 
et al. (US 6415257) in view of Allinger (DE 19747745), as applied to claim 8, and further 
in view of Toyama et al. (US 6502082). 

14. Regarding claim 12, Junqua et al. fail to specifically disclose the process of
claim 8 further comprising the steps of: receiving a user face image; and determining a
degree of despair based on the user face image (col. 1, lines 38-54); wherein the step
of defining comprises the step of: defining at least one system output based on the